3,114 research outputs found

    Quantiles over data streams : experimental comparisons, new analyses, and further improvements

    A fundamental problem in data management and analysis is to generate descriptions of the distribution of data. It is most common to give such descriptions in terms of the cumulative distribution, which is characterized by the quantiles of the data. The design and engineering of efficient methods to find these quantiles has attracted much study, especially in the case where the data are given incrementally, and we must compute the quantiles in an online, streaming fashion. While such algorithms have proved to be extremely useful in practice, there has been limited formal comparison of the competing methods, and no comprehensive study of their performance. In this paper, we remedy this deficit by providing a taxonomy of different methods and describing efficient implementations. In doing so, we propose new variants that have not been studied before, yet which outperform existing methods. To illustrate this, we provide detailed experimental comparisons demonstrating the trade-offs between space, time, and accuracy for quantile computation.
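To make the streaming setting concrete, here is a minimal sketch of one simple way to estimate quantiles in bounded memory: keep a uniform random sample of the stream (reservoir sampling) and read quantiles off the sorted sample. This is an illustrative baseline only; it is not one of the algorithms or variants evaluated in the paper, and the class name `ReservoirQuantiles` and its parameters are assumptions made for this example.

```python
import random


class ReservoirQuantiles:
    """Approximate quantiles of a stream from a fixed-size uniform sample.

    Hypothetical illustration: a plain reservoir sample, not a method
    taken from the paper above.
    """

    def __init__(self, capacity, seed=None):
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity      # maximum number of items kept in memory
        self.sample = []              # the current reservoir
        self.seen = 0                 # number of stream items processed so far
        self.rng = random.Random(seed)

    def update(self, value):
        """Process one stream item using O(1) work and no extra memory."""
        self.seen += 1
        if len(self.sample) < self.capacity:
            self.sample.append(value)
        else:
            # Keep the new item with probability capacity / seen,
            # overwriting a uniformly chosen slot.
            j = self.rng.randrange(self.seen)
            if j < self.capacity:
                self.sample[j] = value

    def quantile(self, q):
        """Return the sample item at rank floor(q * n), clamped to the last item."""
        if not 0.0 <= q <= 1.0:
            raise ValueError("q must lie in [0, 1]")
        if not self.sample:
            raise ValueError("no data seen yet")
        ordered = sorted(self.sample)
        index = min(int(q * len(ordered)), len(ordered) - 1)
        return ordered[index]


if __name__ == "__main__":
    sketch = ReservoirQuantiles(capacity=500, seed=1)
    for x in range(10_000):
        sketch.update(x)
    print("approximate median:", sketch.quantile(0.5))
```

The space/accuracy trade-off mentioned in the abstract shows up directly here: a larger `capacity` costs more memory but tightens the estimate, while the per-item update time stays constant.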

    The Zeroth Law of Thermodynamics and Volume-Preserving Conservative Dynamics with Equilibrium Stochastic Damping

    We propose a mathematical formulation of the zeroth law of thermodynamics and develop a stochastic dynamical theory, with a consistent irreversible thermodynamics, for systems possessing sustained conservative stationary current in phase space while in equilibrium with a heat bath. The theory generalizes underdamped mechanical equilibrium: $dx=g\,dt+\{-D\nabla\phi\,dt+\sqrt{2D}\,dB(t)\}$, with $\nabla\cdot g=0$ and $\{\cdots\}$ respectively representing phase-volume preserving dynamics and stochastic damping. The zeroth law implies stationary distribution $u^{ss}(x)=e^{-\phi(x)}$. We find an orthogonality $\nabla\phi\cdot g=0$ as a hallmark of the system. Stochastic thermodynamics based on time reversal $\big(t,\phi,g\big)\rightarrow\big(-t,\phi,-g\big)$ is formulated: entropy production $e_p^{\#}(t)=-dF(t)/dt$; generalized "heat" $h_d^{\#}(t)=-dU(t)/dt$, $U(t)=\int_{\mathbb{R}^n} \phi(x)u(x,t)\,dx$ being "internal energy", and "free energy" $F(t)=U(t)+\int_{\mathbb{R}^n} u(x,t)\ln u(x,t)\,dx$ never increases. Entropy follows $\frac{dS}{dt}=e_p^{\#}-h_d^{\#}$. Our formulation is shown to be consistent with an earlier theory of P. Ao. Its contradistinctions to other theories, potential-flux decomposition, stochastic Hamiltonian system with even and odd variables, Klein-Kramers equation, Freidlin-Wentzell's theory, and GENERIC, are discussed. Comment: 25 pages
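The link between the orthogonality condition and the stationary distribution can be checked in a few lines. The sketch below assumes a constant scalar $D$ (the abstract does not say whether $D$ is constant) and uses the standard Fokker-Planck equation for the stochastic differential equation quoted in the abstract; it is a consistency check, not the paper's own derivation.

```latex
% Fokker-Planck equation for dx = g dt - D \nabla\phi dt + \sqrt{2D} dB(t),
% assuming constant scalar D:
\frac{\partial u}{\partial t}
  = -\nabla\cdot(g\,u) + D\,\nabla\cdot\big(u\,\nabla\phi + \nabla u\big).

% Insert the candidate stationary density u^{ss} = e^{-\phi}:
\nabla u^{ss} = -e^{-\phi}\,\nabla\phi
  \quad\Longrightarrow\quad
  u^{ss}\,\nabla\phi + \nabla u^{ss} = 0,

% so the damping term vanishes identically, leaving
\frac{\partial u^{ss}}{\partial t}
  = -\nabla\cdot\big(g\,e^{-\phi}\big)
  = -e^{-\phi}\,\nabla\cdot g + e^{-\phi}\,\nabla\phi\cdot g .

% With \nabla\cdot g = 0, the density e^{-\phi} is stationary
% exactly when \nabla\phi\cdot g = 0.
```

Under these assumptions, once $\nabla\cdot g=0$ holds, the orthogonality $\nabla\phi\cdot g=0$ is precisely the condition for $e^{-\phi}$ to be stationary.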

    CO adsorption on Cu(111) and Cu(001) surfaces: improving site preference in DFT calculations

    CO adsorption on Cu(111) and Cu(001) surfaces has been studied within ab-initio density functional theory (DFT). The structural, vibrational and thermodynamic properties of the adsorbate-substrate complex have been calculated. Calculations within the generalized gradient approximation (GGA) predict adsorption in the threefold hollow on Cu(111) and in the bridge-site on Cu(001), instead of on-top as found experimentally. It is demonstrated that the correct site preference is achieved if the underestimation of the HOMO-LUMO gap of CO characteristic for DFT is corrected by applying a molecular DFT+U approach. The DFT+U approach also produces good agreement with the experimentally measured adsorption energies, while introducing only small changes in the calculated geometrical and vibrational properties, further improving agreement with experiment, which is fair already at the GGA level. Comment: 15 pages, 3 figures, submitted to Surf. Sci., WWW: http://cms.mpi.univie.ac.at/mgajdos

    Stochastic Thermodynamics Across Scales: Emergent Inter-attractoral Discrete Markov Jump Process and Its Underlying Continuous Diffusion

    The consistency across scales of a recently developed mathematical thermodynamic structure, between a continuous stochastic nonlinear dynamical system (diffusion process with Langevin or Fokker-Planck equations) and its emergent discrete, inter-attractoral Markov jump process, is investigated. We analyze how the system's thermodynamic state functions, e.g. free energy $F$, entropy $S$, entropy production $e_p$, and free energy dissipation $\dot{F}$, etc., are related when the continuous system is described with a coarse-grained discrete variable. We show that the thermodynamics derived from the underlying detailed continuous dynamics is exact in the Helmholtz free-energy representation. That is, the system thermodynamic structure is the same as if one only takes a middle-road and starts with the "natural" discrete description, with the corresponding transition rates empirically determined. By "natural", we mean in the thermodynamic limit of large systems in which there is an inherent separation of time scales between inter- and intra-attractoral dynamics. This result generalizes a fundamental idea from chemistry and the theory of Kramers by including thermodynamics: while a mechanical description of a molecule is in terms of continuous bond lengths and angles, chemical reactions are phenomenologically described by the Law of Mass Action with rate constants, and a stochastic thermodynamics. Comment: 21 pages, 1 figure

    Genome-wide DNA methylation profiling by modified reduced representation bisulfite sequencing in Brassica rapa suggests that epigenetic modifications play a key role in polyploid genome evolution

    Brassica rapa includes some of the most important vegetables worldwide as well as oilseed crops. The complete annotated genome sequence confirmed its paleohexaploid origins and provides opportunities for exploring the detailed process of polyploid genome evolution. We generated a genome-wide DNA methylation profile for B. rapa using a modified reduced representation bisulfite sequencing (RRBS) method. This sampling represented 2.24% of all CG loci (2.5 × 10^5), 2.16% of CHG loci (2.7 × 10^5) and 1.68% of CHH loci (1.05 × 10^5) (where H = A, T or C). Our sampling of DNA methylation in B. rapa indicated that 52.4% of CG sites were present as 5mCG, with 31.8% of CHG and 8.3% of CHH. It was found that genic regions of single copy genes had significantly higher methylation compared to those of two or three copy genes. Differences in degree of genic DNA methylation were observed in a hierarchical relationship corresponding to the relative age of the three ancestral subgenomes, primarily accounted for by single-copy genes. RNA-seq analysis revealed that overall the level of transcription was negatively correlated with mean gene methylation content and depended on copy number or was associated with the different subgenomes. These results provide new insights into the role epigenetic variation plays in polyploid genome evolution, and suggest an alternative mechanism for duplicate gene loss.

    Cataract prevalence following a nationwide policy to shorten wait time for cataract surgery

    Background: Cataract is an age-related eye disease. Visual impairment from cataract can be restored by cataract surgery. In 2004 the Canadian federal government invested in a multibillion dollar wait time strategy to shorten the wait time for cataract surgery, a government-insured health service in all Canadian jurisdictions. We assessed if this nationwide policy reduced the number of Canadians waiting for cataract surgery as more individuals with cataract were free of cataract following the rapidly conducted surgery. Methods: In this cross-sectional study we analyzed data from randomly selected individuals aged greater than or equal to 45 years responding to the Canadian Community Health Survey (CCHS) in 2000/2001, 2003, 2005, and the CCHS Healthy Aging in 2008/2009. Information on cataract was obtained from self-reported questionnaire. The age- and sex-standardized prevalence of cataract was calculated for comparisons. Results: Cataract was reported by 0.93 million Canadians in 2000/2001, 0.99 million in 2003, 1.10 million in 2005, and 1.34 million in 2008/2009. This corresponds to an age- and sex-standardized prevalence of 8.9% in 2000/2001, 9.0% in 2003, 9.5% in 2005, and 10.2% (P < 0.05) in 2008/2009. The increase in age- and sex-standardized prevalence was greater in individuals without secondary school graduation than those with secondary school graduation or higher (4.3% versus 1.3%, P < 0.05) and was seen in all Canadian provinces. The largest increase was documented in a province (Saskatchewan, from 9.8% in 2000/2001 to 12.6% in 2008/2009, P < 0.05) with the longest median wait times for cataract surgery (118 days in 2008) and the lowest number of ophthalmologists per 100,000 population (1.96 versus 3.35 national average). Conclusions: The age- and sex-standardized prevalence of cataract increased 4-5 years after the multibillion-dollar wait time strategy was launched in 2004. A lower threshold to diagnose cataract may be one potential reason for this finding. Further research is needed to understand the true reasons for the increase.
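For readers unfamiliar with the term, an age- and sex-standardized prevalence is commonly obtained by direct standardization: stratum-specific prevalences are averaged using the population shares of a fixed reference population, so that survey years with different age and sex structures can be compared. The sketch below illustrates that general calculation only. The abstract does not state which standard population or strata the authors used, and the function name, strata, rates, and weights here are all hypothetical.

```python
def standardized_prevalence(stratum_rates, standard_weights):
    """Directly standardized prevalence.

    stratum_rates:    mapping from stratum (e.g. age band and sex) to the
                      observed prevalence in that stratum, as a fraction.
    standard_weights: mapping from the same strata to the size (or share)
                      of each stratum in the reference population.
    """
    if set(stratum_rates) != set(standard_weights):
        raise ValueError("rates and weights must cover the same strata")
    total_weight = sum(standard_weights.values())
    if total_weight <= 0:
        raise ValueError("standard weights must sum to a positive value")
    weighted_sum = sum(
        stratum_rates[stratum] * standard_weights[stratum]
        for stratum in stratum_rates
    )
    return weighted_sum / total_weight


if __name__ == "__main__":
    # Hypothetical strata and numbers, for illustration only;
    # none of these values come from the study.
    rates = {
        ("45-64", "F"): 0.04,
        ("45-64", "M"): 0.03,
        ("65+", "F"): 0.22,
        ("65+", "M"): 0.18,
    }
    weights = {
        ("45-64", "F"): 0.33,
        ("45-64", "M"): 0.32,
        ("65+", "F"): 0.19,
        ("65+", "M"): 0.16,
    }
    print(f"standardized prevalence: {standardized_prevalence(rates, weights):.3f}")
```

Because the same weights are applied to every survey year, a change in the standardized figure reflects a change in stratum-level prevalence rather than population aging alone.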

    Increased axonal bouton dynamics in the aging mouse cortex

    Aging is a major risk factor for many neurological diseases and is associated with mild cognitive decline. Previous studies suggest that aging is accompanied by reduced synapse number and synaptic plasticity in specific brain regions. However, most studies to date used either postmortem or ex vivo preparations and lacked key in vivo evidence. Thus, whether neuronal arbors and synaptic structures remain dynamic in the intact aged brain and whether specific synaptic deficits arise during aging remain unknown. Here we used in vivo two-photon imaging and a unique analysis method to rigorously measure and track the size and location of axonal boutons in aged mice. Unexpectedly, the aged cortex shows circuit-specific increased rates of axonal bouton formation, elimination, and destabilization. Compared with the young adult brain, large (i.e., strong) boutons show 10-fold higher rates of destabilization and 20-fold higher turnover in the aged cortex. Size fluctuations of persistent boutons, believed to encode long-term memories, also are larger in the aged brain, whereas bouton size and density are not affected. Our data uncover a striking and unexpected increase in axonal bouton dynamics in the aged cortex. The increased turnover and destabilization rates of large boutons indicate that learning and memory deficits in the aged brain arise not through an inability to form new synapses but rather through decreased synaptic tenacity. Overall our study suggests that increased synaptic structural dynamics in specific cortical circuits may be a mechanism for age-related cognitive decline.